When pg_rewind success, the database can't startup - Mailing list pgsql-bugs
From | hemin |
---|---|
Subject | When pg_rewind success, the database can't startup |
Date | |
Msg-id | D73AFF2B-7325-4047-A325-F4B70828023B@ww-it.cn>+554133D5E253EE07 Whole thread Raw |
List | pgsql-bugs |
Dear PGer:
I use pg_rewind to avoid the WAL diverged success, but the database can’t startup, and output error “requested timeline 3 does not contain minimum recovery point 0/DB35BE80 on timeline 1”. Fallow is the detail.
Thanks !
Problem Description:
There is a primary standby cluster with async replication, when large data inserting into the primary node, we stop the database by hand. Then promote the standby node to be new primary node and insert new data into it. Finally use pg_rewind to avoid WAL diverged success, but the node can not to be startup with fallow error:
“2018-06-06 14:40:18.686 CST [2687] FATAL: requested timeline 3 does not contain minimum recovery point 0/DB35BE80 on timeline 1
2018-06-06 14:40:18.686 CST [2686] LOG: startup process (PID 2687) exited with exit code 1”
Environment: primary standby cluster with async replication, the database version is postgresql-10
Primary Node Info:
System: centos 6, IP:10.9.5.21, port 5410
Standby Node Info:
System: centos 6, IP:10.9.5.22, port: 5410
Reproduce Step:
(1) Init environment: Create a primary standby cluster with async replication, and add access role in pg_hba.conf which tool pg_rewind will be use;
(2) Primary Node: insert 1,500,000 rows data into database use pgbench:
pgbench -i -s 15 postgres
(3) Primary Node: when pgbench is insert end, and begin vacuum the database, we stop the database by hand:
pg_ctl -D $PGDATA stop
(4) Standby Node: promote the standby node to be primary:
pg_ctl -D $PGDATA promote
(5) Standby Node: inset 3,000,000 rows data into database use pgbench to:
pgbench -i -s 30 postgres
(6) Primary Node: use pg_rewind to avoid WAL diverged,:
pg_rewind --target-pgdata='/var/lib/pgsql/10/data' --source-server='host=10.9.5.22 port=5410 dbname=postgres user=postgres password=xxx’
servers diverged at WAL location 0/AEEE94D0 on timeline 1
rewinding from last common checkpoint at 0/AEEE9460 on timeline 1
Done!
(7) Primary Node: startup failed:
pg_ctl -D $PGDATA start
waiting for server to start....2018-06-06 14:40:18.194 CST [2686] LOG: listening on IPv4 address "0.0.0.0", port 5410
2018-06-06 14:40:18.194 CST [2686] LOG: listening on IPv6 address "::", port 5410
2018-06-06 14:40:18.256 CST [2686] LOG: listening on Unix socket "/tmp/.s.PGSQL.5410"
2018-06-06 14:40:18.372 CST [2687] LOG: database system was interrupted while in recovery at log time 2018-06-06 14:12:45 CST
2018-06-06 14:40:18.372 CST [2687] HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
2018-06-06 14:40:18.686 CST [2687] LOG: entering standby mode
2018-06-06 14:40:18.686 CST [2687] FATAL: requested timeline 3 does not contain minimum recovery point 0/DB35BE80 on timeline 1
2018-06-06 14:40:18.686 CST [2686] LOG: startup process (PID 2687) exited with exit code 1
2018-06-06 14:40:18.686 CST [2686] LOG: aborting startup due to startup process failure
2018-06-06 14:40:18.690 CST [2686] LOG: database system is shut down
stopped waiting
pg_ctl: could not start server
Examine the log output.
何敏
Call: 185.0821.2027 | Fax: 028.6143.1877 | Web: w3.ww-it.cn
成都文武信息技术有限公司|ChengDu WenWu Information Technology Inc.|WwIT
地址: 成都高新区天府软件园B区7栋611 |邮编:610041
pgsql-bugs by date: